Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference

Authors

Abstract

Similar resources

Multi-GPU Computing for Achieving Speedup in Real-time Aggregate Risk Analysis

Stochastic simulation techniques employed for portfolio risk analysis, often referred to as Aggregate Risk Analysis, can benefit from exploiting state-of-the-art high-performance computing platforms. In this paper, we propose parallel methods to speed up aggregate risk analysis for supporting real-time pricing. To achieve this, an algorithm for analysing aggregate risk is proposed and implemented ...

Tight Frame Approximation for Multi-Frames and Super-Frames

This thesis studies a generator for a multi-frame or super-frame generated under the action of a projective unitary representation of a countable discrete group. Examples of such frames include Gabor multi-frames, Gabor super-frames, and frames for shift-invariant subspaces. We show that there exists a unique normalized tight multi-frame (super-frame) generator that minimizes the distance to it. Similar problems are also posed for dual frames, and some ...

FPGA-Based Fuzzy Inference System for Real-time Embedded Applications

The traditional way of implementing algorithms in software limits the performance of real-time systems, since the data is processed serially. The new generation of FPGAs with embedded processors is attracting interest for real-time applications. With enhanced capabilities, most of the processing tasks can be loaded from the software program stack to embedded processors on the FPGA to imp...

Real-Time Scheduling of Linear Speedup Parallel Tasks

In this paper, a problem of deterministic scheduling of parallel applications in a multiprogrammed multiprocessor system is considered. We address the preemptive case. The number of processors used by a task can change over time. Any task can be executed with linear speedup on a number of processors not greater than some task-dependent constant. This problem can be solved by a low-order polynomial ...

Achieving Linear Speedup in Parallel LRU Cache Simulation

Previous work on simulation of LRU caching led to the development of parallel algorithms that are efficient for small numbers of processors. However, these algorithms exhibit a sub-linear speedup, where the efficiency seriously decreases with a higher number of processors. In order to achieve linear speedup, this work proposes the use of approximation techniques with the existing parallelizatio...

Journal

Journal title: ACM Transactions on Embedded Computing Systems

Year: 2019

ISSN: 1539-9087, 1558-3465

DOI: 10.1145/3358192